Computer-aided detection of arrhythmogenic sites in post-ischemic ventricular tachycardia

Nowadays, catheter-based ablation in patients with post-ischemic ventricular tachycardia (VT) is performed in arrhythmogenic sites identified by electrophysiologists by visual inspection during electroanatomic mapping. This work aims to present the development of machine learning tools aiming at supporting clinicians in the identification of arrhythmogenic sites by exploiting innovative features that belong to different domains. This study included 1584 bipolar electrograms from nine patients affected by post-ischemic VT. Different features were extracted in the time, time scale, frequency, and spatial domains and used to train different supervised classifiers. Classification results showed high performance, revealing robustness across the different classifiers in terms of accuracy, true positive, and false positive rates. The combination of multi-domain features with the ensemble tree is the most effective solution, exhibiting accuracies above 93% in the 10-time 10-fold cross-validation and 84% in the leave-one-subject-out validation. Results confirmed the effectiveness of the proposed features and their potential use in a computer-aided system for the detection of arrhythmogenic sites. This work demonstrates for the first time the usefulness of supervised machine learning for the detection of arrhythmogenic sites in post-ischemic VT patients, thus enabling the development of computer-aided systems to reduce operator dependence and errors, thereby possibly improving clinical outcomes.


Materials and methods
Dataset. A dataset collected from nine patients (78% male, age: 66 ± 10 years old, ejection fraction: 29.4% ± 5.8%) with post-ischemic VT between 2017 and 2018 at the San Francesco Hospital (Nuoro, Italy) was created. A retrospective study on anonymized patient signals was approved by the Independent Ethical Committee of the Azienda Tutela Salute, Sardegna (Prot. n. 351/2021/CE, date of approval: 13/07/2021) and performed following the principles outlined in the 1975 Helsinki Declaration, as revised in 2000. All the participants provided their signed informed consent. The CARTO 3 system (Biosense Webster, Inc., Diamond Bar, California) was used to perform the recordings in sinus rhythm, during left ventricle (LV) EA mapping. Subsequently, radiofrequency ablation was performed according to the current guidelines. Bipolar EGMs were recorded using PentaRay (Biosense Webster, Inc., Diamond Bar, California) 2-6-2 mm, ThermoCool SmartTouch and ThermoCool SmartTouch SF (Biosense Webster, Inc., Diamond Bar, California). The CARTO 3 system sampled all signals at 1 kHz and performed band-pass filtering between 16 and 500 Hz. Even though the length of each exported recording was 2.5 s, for each EGM, a segment around the last cardiac cycle identified by the reference annotation (i.e., a fiducial point in the cardiac cycle defined according to the ventricular activity) was considered to include only acquisitions performed when the catheter was in contact with the endocardium. All EGM segments were manually annotated as physiological or AVPs by an experienced electrophysiologist blinded to the case and to the position of the potentials in the EA map with the use of an ad-hoc MATLAB graphical user interface that was implemented for this purpose in previous studies 49,50 . It should be also pointed out that all EGMs without pathological deflections (i.e., exhibiting a "normal" behaviour) belonging not only to healthy tissue (i.e., from undamaged ventricular areas), but also to border-zone and scar substrates, were annotated as physiological.
We chose to perform a blind annotation to the EGM localization on the EA map, so that the clinician may recognize the different potentials without any influence from their spatial localization, but solely looking at the amplitude and morphology of the examined EGM. In fact, the final objective of the annotation process was the identification of EGMs with normal and abnormal behaviors, being the AVPs correlated with arrhythmogenic sites. Surely, the spatial localization must be taken into account when the ablation targets should be identified, which however is one step further, and beyond the scope of the present work. The dataset consisted of 1584 EGMs, including only exported signals that were spatially projected in the reconstructed LV EA map with a projection distance below 8 mm, for a correct localization of the EGM on the EA map. This threshold was selected by considering the recommendations provided by the CARTO 3 system technical documentation 51 . There, 8 mm is the threshold adopted to graphically warn, by means of blue markers, the operator about EA points with less accurate spatial localization. Clearly, the lower the projection distance, the more reliable is the EA point anatomical localization. Among all EGMs, 618 (i.e., 39%) AVPs were identified, along with 966 (i.e., 61%) physiological potentials. As the VT arrhythmogenic areas are typically restricted when compared with healthy myocardial tissue, the dataset composition was intrinsically unbalanced toward the physiological signal class.
Specifically, different types of potentials fell into the AVP class, mainly enclosing endocardial bipolar deflections spreading after the end of the corresponding surface QRS depolarization, those starting during the corresponding surface QRS depolarization but vanishing after its end, and early EGM deflections completely falling  50,52 . Some examples of EGMs included in the adopted dataset are shown in Fig. 1 with their relative spatial location on the voltage EA map.
Feature extraction. In light of our prior investigations [48][49][50] , multiple quantitative features that characterize the EGM were extracted and used to train supervised classification models to distinguish pathological from physiological potentials. Four types of features were identified, which were obtained from the time, frequency, time-scale, and spatial domains, as detailed hereinafter.
Features from time domain. In the time domain, two features were extracted for each bipolar EGM segment: the peak-to-peak amplitude (A pp ) and a fragmentation measure. The A pp has been largely exploited in the scientific literature 7,14,53,54 and as such, it was introduced to enable the differentiation between high voltage EGMs (A pp > 1.5 mV), which are typically associated with healthy myocardium, and signals originating from scar and border zone regions, which are generally characterized by lower voltages (i.e., A pp < 0.5 mV and 1.5 mV < A pp < 0.5 mV, respectively) 53 . To determine the A pp , we analyzed each EGM within a 350-ms window around the reference annotation (i.e., 50 ms before and 300 ms after), and computed the difference between the largest EGM positive deflection (A p + ) and the absolute value of the largest negative deflection (A p − ) from the isoelectric within that window, as: The fragmentation measure was adopted because fractionated potentials indicate slow and inhomogeneous conduction of the electrical activation locally 55,56 . Even though it has not been analytically described, this measure has been frequently used in the literature 22,31,33,35,57,58 . In this study, fragmentation was defined as the total number of peaks in the 350-ms long EGM segment, as in 48,49 . To do this, the 350-ms EGM segment was rectified by extracting its absolute value, and all the peaks exceeding a given amplitude threshold were counted. In this computation, the threshold T was identified as 75% of the mean absolute deviation (mad) of the rectified EGM, as: Features from time-scale domain. Two features have been also computed in the time-scale domain using continuous wavelet transform (CWT), which guarantees a good resolution both in time and in frequency domains. At first, a preliminary undecimated wavelet denoising 48 was performed, by considering a 2-level decomposition, the Daubechies-2 mother wavelet, the Universal threshold ( θ j ) and the soft thresholding of the detail coefficients (cD), as: www.nature.com/scientificreports/ where cD j,k indicates the k-th detail coefficient at level j, while σ j was defined as: Undecimated wavelet transform was used to achieve translation invariance, as the morphological analysis is affected by the loss of this property in conventional decimated schemes. This denoising stage, which is applied to all the spectral contents from about 125-500 Hz (according to the chosen level of decomposition), was aimed at reducing small noisy fluctuations of the EGMs before computing the features in the time-scale domain. Then, the CWT was performed with Daubechies-2 mother wavelet and scales from 2 to 35, and the sum and the standard deviation of the average powers associated with the five most powerful scales were computed across the scales of the CWT decomposition and considered as features. Remarkably, although on the overall dataset the five most powerful scales range from 31 to 35 (which respectively correspond to about 22.2, 21.5, 20.8, 20.2 and 19.6 Hz), the identification of the five most powerful scales was performed independently for each EGM. As such, those scales vary from EGM to EGM independently, according to their spectral content.
Features from frequency domain. On the basis of recent evidence 50 , several spectral signatures of EGMs were considered, including the relative power in several frequency sub-bands and other metrics relying on the morphology of the EGM spectrum.
Specifically, after the power spectral density (PSD) computation, the relative power content of each EGM was considered, according to a sub-band analysis approach with a 20 Hz granularity. Specifically, in each sub-band, the relative power was determined as the percentage ratio between the absolute power contained in that sub-band and the overall power content up to 320 Hz, according to a previous study 50 . Indeed, relative powers allowed eliminating the influence of the signal amplitude, which conversely is already accounted for in the time domain features. Relative power showed to be useful in distinguishing between AVPs and physiological potentials 59 , and to detect statistically significant differences among them in several frequency ranges 50 . Accordingly, the relative powers in the 20 Hz sub-bands between 0 and 320 Hz were considered as features. However, in this analysis, the sub-band 20-40 Hz was not included, as no statistical difference between physiological signals and AVPs was found for that sub-band so far 50 .
Furthermore, some interesting statistical differences between physiological potentials and AVPs emerged in the literature 50 also by looking at the PSD morphologies through some synthetic metrics, as the mean frequency (MNF), the peak frequency (PKF), the mean spectral power (MNP), and the power spectrum ratio (PSR). Specifically, the MNF expresses if the power spectral contents are mostly localized at higher or lower frequencies, the PKF identifies the frequency in which the PSD maximum occurs, and the MNP and the PSR quantify the average power below 320 Hz and in the range around the PKF, respectively 50,60 . As such, in this study, we included all these metrics as features in the frequency domain.
Features from spatial domain. Different features related to the spatial localization of each EGM were deduced from the EA maps to provide more insights on the electrical characteristics of the surrounding endocardial area. Arrhythmogenic sites are often located in altered conduction areas, such as border zones or dense scars, rather than randomly distributed all over the EA map. Therefore, spatial features were added to fill the information gap caused by the exclusive adoption of signal morphology based features.
Specifically, the spatial information was retrieved from two perspectives: the voltage map (substrate map) and the local activation time and signal conduction properties (LAT map).
First, we considered the circular area C P on the EA map, with a radius ǫ and centered in P, where P is the projection on the map of the considered EGM acquisition point (see Fig. 2). According to preliminary investigations (data not shown), ǫ was set to 6 mm. For each EGM, all the neighbors (other acquisition points for which an EGM is available that fall within this C P area) were considered to extract the following features. An average number of 55 neighbors (95% CI, 51-58) was identified in each C P . On the substrate map, we extracted: • The mean bipolar voltage, computed as the mean value of the peak-to-peak amplitude A pp exhibited by all EGMs in C P ; • The weighted mean bipolar voltage, evaluated as before but weighting the neighbors' contribution by the exponential function e −x , where x represents the Euclidean distance of the different points in C P from the point P; • The standard deviation of the A pp exhibited by the bipolar EGMs acquired in C P ; • The number of neighbors associated with scar, border zone, and healthy tissues. This number was normalized with respect to the total neighbors identified in C P , to provide features independent from the density of the points acquired during the EA mapping procedure.
On the LAT map, we extracted three additional spatial features for each EGM: • The LAT value, i.e., the local activation time, as provided by the CARTO 3 exported files or, if not present, by considering the CARTO 3 interpolated LAT in the nearest EA points. • The conduction velocity (CV), which was derived from the LAT information according to 61,62 , is intrinsically connected to electrophysiological characteristics of the myocardial substrate, especially in cases of (4) cD j,k = sign cD j,k cD j,k − θ j if cD j,k ≥ θ j 0 otherwise  63 and provides essential information on the speed and the direction of the wavefront propagation; • The coherence of directions (CD) in C P , which was computed as the standard deviation of the angles between the CV vector of P and those of the neighbors. This feature was introduced to express the level of randomness in the direction of the wavefront conduction (the higher the CD value, and then the standard deviation of the abovementioned angles, the more disordered the conduction).
Finally, a standardization stage was performed on all features. As such, by considering each feature separately, and its values across the different EGMs (and patients) as a distribution, the standardization was performed according to the Z-score normalization method, by subtracting the mean and dividing by the standard deviation each value of such distribution. In this way, all the features belong to zero-mean, unit-variance distributions. This pre-processing step was adopted in order to get rid of any bias related to the different range of values that the input features may assume according to their definition, thus providing less sensitivity of the models to the scale of input features.
Classification models. Three feature-based supervised classification models were trained for the recognition of AVPs and physiological potentials: support vector machine (SVM), K-nearest neighbors (KNN), and ensemble tree (ENS). First, a preliminary study for the choice of the SVM kernel and the number of neighbors for the KNN was conducted (see Supplementary Tables S1 and S2). On the basis of this study, the third-order polynomial kernel was chosen for the SVM, and 10 neighbors were considered for the KNN.
For all the three identified models, the MATLAB default parameters for the chosen classifiers were adopted. The rationale behind this choice is that parameters optimization on a limited-size dataset is prone to overfitting, and the goal of the research was mainly the assessment of the presented approach to solve the problem at stake rather than the achievement of the highest performance on the available dataset. Accordingly, we preferred a more robust classification model to the ideal classification performance by using optimization techniques or parameter tuning. Moreover, given the limited dataset size, the introduction of a validation set for parameter optimization would have further reduced test and training sets, affecting overall performance.
As regard such default parameters, for the SVM, the sequential minimal optimization routine along with a unitary penalty weight (i.e., box constraints) and no expected proportion of outliers in training data were used. For the KNN, a Euclidean neighbor searcher method with a maximum of 50 data points in the leaf node and no weighting method were imposed. For the ENS, the logit boost aggregation method with 100 ensemble learning cycles and 10 decision splits per tree (at maximum) were adopted. Feature selection. Feature selection was performed to understand if taking into account only the most relevant features may improve, worsen, or leave the classification results unchanged. The minimum redundancy maximum relevance (mRMR) feature selection method was adopted to highlight the most useful features in the set with respect to the response variable 64,65 , thus identifying the features that have highest correlation with the response but the lowest correlation among themselves. Feature selection was performed on the training set obtained in each fold and time for the 10-time 10-fold cross-validation scheme, leading to 100 assessments of Figure 2. Schematic representation of the 6-mm circular area C P determining the neighbors of the point P (in red) into the EA voltage map. As can be seen, the neighborhood is centered on the location of the EGM of interest (P) and spans for a radius equal to ǫ = 6 mm. Black dotted arrows represent the distances between each neighbor point and P. All EGM points falling within the selected area are exploited for the computation of the spatial features of P. The EA voltage map was created from CARTO 3 export files in MATLAB v2022a (MathWorks Inc., MA, USA). www.nature.com/scientificreports/ the features. Then, a common set of features was retrieved as follows. Considering the relevance score provided by the mRMR algorithm for each feature, a score vector was generated and normalized between 0 and 1 at each iteration. Subsequently, a single score vector (V s ) was obtained by summing the score for each feature, thus computing a unique relevance value. Then, V s was sorted in descending order, and the contribution of each feature, from the most to the less important one, was cumulatively summed until 80% of the total relevance was reached. At this point, the first M features composing the 80% of the total relevance were considered as selected features for investigating the entire set both in the 10-time 10-fold and in the leave-one-subject-out cross-validation. The 80% threshold was imposed because it was found to be adequate for feature reduction according to preliminary investigations (data not shown).

Performance evaluation.
To provide an objective and quantitative evaluation of the proposed classification approaches, all classification methods were trained and tested in two ways. First, a stratified 10-time 10-fold cross-validation scheme was used. This validation strategy was adopted to obtain an objective and robust evaluation of the performance by exploiting stratified partitions based on a common pool of instances coming from all the patients. It provides an unbiased result over a dataset characterized by a reduced number of instances. In this case, an equal number of instances of AVPs and physiological signals were randomly selected from the main pool to compose each fold, which accounts for 10% of the pool size. Then, nine folds were used for training and one for test, iteratively repeating training and test in order to cover all the permutations. Though this approach is conceived to reduce the bias of a single validation, a further unbiasing was pursued by repeating ten times this process. In this case, a different random selection of physiological EGMs (available in more instances than AVPs) to be inserted in the data pool was carried out, with the aim of performing an evaluation as generalizable as possible.
To provide an insight into the performance of our approach in a scenario that most resembles a real EA mapping application, a leave-one-subject-out cross-validation scheme was also explored. This approach allows for an accurate performance evaluation on EGMs from patients never seen by the classifier during the training phase, thus mimicking a real-world application in which the model would be exploited on patients never seen before, and for the evaluation of the impact due to eventual patient-specific characteristics. Each model was trained and tested nine times, each time using eight patients out of nine for training and the remaining one for testing. This latter analysis aimed to investigate if the models were possibly affected by patient-specific characteristics.
To perform an unbiased performance assessment, all models were trained and tested using the same sets of AVPs and physiological signals in each fold or patient-related partition of the dataset.
To evaluate the performance, five metrics were computed, i.e., the accuracy (ACC), the true positive rate (TPR) or sensitivity, the true negative rate (TNR) or specificity, the false positive rate (FPR), and the F1-score, as follows: where P and N represent the total number of AVPs and physiological EGMs, respectively; TP and TN are the number of AVPs and physiological potentials that were correctly identified, respectively; FP is the normal EGMs classified as AVPs; and PPV is the positive predictive value or precision, defined as: In the 10-time 10-fold cross-validation scheme the performance metrics were evaluated on the total number of TP , TN , FP , P , and N obtained by summation over the different folds at each time. Therefore, all indexes were computed for each time, and results were reported as mean and standard deviation of the values obtained across the 10 iterations, thus yielding a better representation of the performance over the whole dataset.
Similarly, in the leave-one-subject-out case, summation was exploited over the different patient-related subsets, thus collecting a single value per metric. This choice was necessary to avoid computing percentages over strongly imbalanced subsets.
All computations were performed with MATLAB v2022a (MathWorks Inc., MA, USA). Figure 3 illustrates the features selected using the mRMR-based method by reporting their cumulative relevance scores. In accordance with the proposed feature selection approach, all features up to #16 were retained, because they cumulatively reached the 80% of the total relevance (black dotted line). Table 1 reports the results of the three classifier models in the 10-time 10-fold cross-validation, both with all the features or after the feature selection. The classification performances among the three classifiers were high, similar, and stable, thus highlighting the robustness of the proposed features regardless of the adopted classification model. However, the results remained stable even when only some features were considered. Furthermore, In particular, the ENS model presented similar and balanced performance in both metrics, which were approximately 93% when no feature selection was performed, whereas they were above 91% when features were selected. The same trend was observed in the SVM, but only when no feature selection was performed. Conversely, the KNN model exhibited an imbalanced performance with higher TPR than TNR, thus allowing for a better recognition of AVPs than physiological EGMs. Table 2 gives an overview of the results in a possible real application scenario by exploiting the leave-onesubject-out cross-validation. The results remained high and stable when all features or only the selected ones were considered, and the ENS provided the highest performance. However, the TPR and TNR values when no feature selection was performed indicate that an imbalance can be observed for the ENS model, which better identified physiological potentials than AVPs (TNR = 88.1% vs. TPR = 79.4%), while the other two models  (32) relative PSD power in 0-20 Hz. Each bar represents the relevance score due to the features up to the one shown in the x-axis (i.e., the second bar corresponds to the relevance contributions of features #1 and #2 with respect to the total amount of relevance, and so on), except for the case of feature #1, in which the cumulative relevance score coincides with the relevance score assumed by this feature. Table 1. Classification performance in the 10-time 10-fold cross-validation. Results in the 10-time 10-fold cross-validation are reported as mean and standard deviation for all performance indexes and approaches, being computed across the 10 iterations cumulatively. Evaluations were performed by considering all features, i.e., without any feature selection, and including only those features selected by the mRMR algorithm.

Discussion
In this work, several features were presented and assessed to investigate their effectiveness in designing and training an automatic system that can support clinicians in AVP identification during post-ischemic VT substrate mapping procedures. All features were conceived according to the electrophysiological characteristics of AVPs and physiological potentials, and thus extracted from different domains.
The results show that the adoption of multi-domain features that aim to recognize AVPs is feasible and effective, allowing for reliable and accurate recognition in more than 93% of cases when 10-time 10-fold crossvalidation is considered (i.e., 89.4%, 88.3%, and 93.1% for SVM, KNN, and ENS models, respectively; see Table 1), and a very high rate of correctly recognized AVPs (i.e., 89.3%, 93.0%, and 93.1% for SVM, KNN, and ENS models, respectively) with low false positive alarms (i.e., 10.4%, 16.4%, and 6.9% for the SVM, KNN, and ENS models, respectively), while guaranteeing stable and precise identification (see standard deviation values in Table 1). These findings were further confirmed when the number of involved features was reduced, thus highlighting that the adopted feature selection strategy could lead to comparable and accurate performance with a reduced computational load. Similar results were obtained when a more restricted number of features (i.e., the five most relevant ones) was considered, with performance differences < 3.5% (Supplementary Table S3), thus paving the way for a light integration of the proposed approach in off-the-shelf systems.
A comparison with the classification results reported in a previous study 49 , with respect to the adoption of the time and time scale domain features only, indicates that our new features improved the recognition accuracy by about 10% on average. Their use even outperformed the adoption of an artificial neural network trained and tested on the time series of each EGM (+ 10% accuracy), thus confirming their effectiveness for AVP recognition.
When the performance on the leave-one-subject-out cross-validation was considered, which has the greatest resemblance to a real application scenario, a slight deterioration in all performance indexes can be observed, as expected, with greater differences between the accuracy and TPR values (i.e., a decrease of 10.4% and 13.4%, on average, respectively, when no feature selection is performed, and 8.8% and 9.9% when feature selection is adopted, respectively). The use of patient-related validation introduced a strong imbalance not only in terms of numerosity of AVPs and physiological potentials annotated by the cardiologist for each patient, but also in terms of EGMs annotated for each EA mapping procedure, thus producing an uneven split of the dataset into the training and testing sets. Therefore, the use of the leave-one-subject-out scheme led to models with lower generalization capabilities when applied to observations belonging to patients never seen during training, thus yielding lower performance. The SVM model showed the most relevant reduction in performance with respect to the 10-time 10-fold case. Furthermore, the difference between the classification model performances increased, so that the accuracy difference between the SVM and ENS models changed from 3.7 to 10.0% when no feature selection was performed and from 4.1 to 9.5% when feature selection was implemented (see Table 1). The same trend can also be observed in TNR values.
The literature already reported studies on feature-based approaches developed to support clinicians in the identification of VT arrhythmogenic sites, and emerging strategies to target points for VT ablation. However, a performance comparison is complex as these works present findings based on different metrics and populations, and the rationale is different. As a matter of fact, some studies aimed at the identification of cut-off thresholds on features or criteria modelled on VT and non-VT patients as control group [33][34][35]37 , which is very different from distinguishing between AVPs and physiological EGMs in the same post-ischemic VT population. Furthermore, some important works aimed at detecting ablation areas 33,41 , fractionated signals 38 , VT sites of origin 39,40 , or VT isthmuses 36 , which is not strictly the same aim of the present study. Some other studies investigated alternative targeting strategies for ablation 43,45,47 , for the choice of the ablation targets in prospective clinical studies, and their efficacy was evaluated in terms of the clinical response of the patients after ablation. In contrast with the approach presented in our study, which aims at detecting AVPs to assist clinicians in determining arrhythmogenic sites during EA mapping, these investigations actually aimed to identify ablation target points, which is beyond the scope of our work. Table 2. Classification performance in the leave-one-subject-out cross-validation. Results in the leaveone-subject-out cross-validation are provided as a single cumulative value for each performance index and approach by considering all features, i.e., without any feature selection, and including only those features selected by the mRMR algorithm in the 10-time 10-fold case. www.nature.com/scientificreports/ A retrospective analysis of the location of the arrhythmogenic sites identified by the best-performing model in the leave-one-subject-out validation scheme (i.e., the ENS classifier), with respect to the voltage map and to the sites ablated during the clinical procedure, has been carried out. Remarkably, being our study performed retrospectively, ablation points were selected regardless of the proposed computer-aided system, according to different clinical strategies.
Indeed, the relevance of the proposed computer-aided identification of AVPs can be appreciated by looking at the examples reported in the Fig. 4. Specifically, in those cases where the scar-related substrate is not characterized by complex re-entry morphologies, as the one shown in Fig. 4a, the AVPs are quite well positioned onto the map. In that reported case, the clinician identified and ablated one channel entrance/exit to the slow conduction zone (black star in Fig. 4a) and a VT isthmus (black diamond in Fig. 4a). In this map, AVPs detected by the proposed system are coherently placed in correspondence of the entrance and in the inner portion of the isthmus. Consequentially, when the VT circuits are not complex, the value of the proposed system would be in speeding up the clinical procedure and supporting the easier identification of the re-entrant circuits in the scar. Conversely, an opposite situation can be observed in case of very complex VT circuits, as the one shown in Fig. 4b. In this case, lots of AVPs were identified by our system, predominantly near ablated regions. In contrast to this, during the ablation procedure, the AVPs located in apical LV area (black octagram in Fig. 4b) have been missed. Remarkably, in the same region, several AVPs have been correctly recognized by our tool, which would have been helpful to plan the subsequent ablation, if available in real-time. Looking more deeply at the same map, we can appreciate several EGMs annotated as physiological but recognized as AVPs by the ENS (purple marbles in Fig. 4b), mainly in deep scar areas. Here, even if the ENS output did not agree with the annotation, it is interesting to note that such false AVPs are surrounded by real AVPs and ablated sites, thus possibly reflecting some pathological aspects objectively identified by the model but not by the human operator, hence without overfitting the expert's point of view. As such, by considering the typical low success rate of substrate ablation procedures during sinus rhythm (i.e., about 60-70% [18][19][20], our study is encouraging since can support clinical expert in arrhythmogenic sites identification in complex VT circuits, where the human commitment is hampered, thus highlighting arrhythmogenic sites to be considered for optimal ablation strategy and outcome.

Conclusion
This work demonstrated the feasibility of supervised machine learning tools for the identification of arrhythmogenic sites in post-ischemic VT patients, thus enabling the conception and adoption of computer-aided systems supporting clinicians and possibly improving the clinical outcome.
Different classification models for the automatic identification of AVPs in intracardiac bipolar EGMs were trained by using features from multiple domains. Classification performance revealed the robustness of the chosen features with respect to the different classification models, although some differences can be observed. High recognition capabilities are observable even when a leave-one-subject-out validation scheme is applied, which is closer to a real application scenario, albeit with a performance deterioration, particularly in terms of accuracy and TPR. Furthermore, arrhythmogenic sites identified by the best-performing model in the leaveone-subject-out validation scheme have been contextualized in the procedure in terms of location on the EA map and position of the real ablation targets, confirming the potentiality in assisting clinical experts during EA mapping. This study has two main limitations, namely, the dataset size and the single-expert annotation, which could have biased the results. As such, dataset enlargement is needed, including multiple annotators for the labeling process, to confirm the results of this study. Figure 4. Some examples of voltage maps of procedures included in this work, in which AVPs correctly identified (green marbles), EGMs labelled as physiological but recognized as AVPs (purple marbles), EGMs annotated as AVPs but recognized as physiological (light blue marbles), and endocardial sites ablated during the procedure (red marbles) are depicted. On the left map (a), a channel entrance/exit to the slow conduction zone (black star) and a VT isthmus (black diamond) are depicted, while on the right map (b), a cluster of not-ablated AVPs located in apical LV area (black octagram) is highlighted. The EA voltage maps were generated from CARTO 3 export files in MATLAB v2022a (MathWorks Inc., MA, USA). www.nature.com/scientificreports/ Despite being evaluated in a retrospective study, our findings are encouraging, suggesting the possibility of AVP automatic recognition with the proposed features, the conception of an AI-based computer-aided systems that support clinicians in VT interventional cardiology, and also possibly the improvement of clinical outcomes. On this regards, larger studies are needed to prospectively evaluate the clinical value of introducing the proposed machine learning tools in supporting clinicians in the identification of arrhythmogenic sites for VT ablation during electrophysiological procedures.

Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.